AITopics | question 1

question 1

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Testable and Actionable Calibration for Full Swap Regret

Bairaktari, Konstantina, Hu, Lunjia, Nguyen, Huy L., Ullman, Jonathan

arXiv.org Machine Learning, May-19-2026

AI-generated predictions increasingly inform decision making in critical tasks, and therefore must be trustworthy. One widely used measure of trustworthiness is calibration, which requires that the predictions match the true frequencies and can be treated like real probabilities of a given outcome. However, defining calibration is subtle, and designing good measures of calibration error has been an active topic of recent research. The first goal is to find calibration measures that are actionable, meaning they can inform decision makers about their utility loss when predictions are treated as true probabilities, which is known as swap regret. The second goal is to find calibration measures that are testable, meaning that calibration error can be measured from a small sample of predictions and outcomes. Although these are very basic requirements, there is no existing calibration measure that fully satisfies both properties: all existing measures either relax actionability by bounding a weaker notion of swap regret, or relax testability by having suboptimal estimation error. We introduce a new calibration measure, Soft-Binned Calibration Decision Loss (SCDL), which we prove is fully actionable without weakening either requirement, and testable with nearly optimal error rate. In addition, SCDL satisfies other desired properties such as continuity and consistency. We also provide a set of experiments confirming that the theoretical advantages of SCDL compared to other measures lead to better performance in practice.

artificial intelligence, machine learning, scdl, (16 more...)

arXiv.org Machine Learning

2605.17749

Country: North America > United States (0.46)

Genre: Research Report > New Finding (0.45)

Technology: Information Technology > Artificial Intelligence > Machine Learning (1.00)
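
To make the notion of calibration in the abstract above concrete, here is a minimal sketch of the classic hard-binned expected calibration error. It is not SCDL and is not taken from the paper; the function name `binned_calibration_error` and the `num_bins` parameter are assumptions chosen for illustration. It only shows what "predictions match the true frequencies" means when measured on a sample.

```python
from typing import Sequence


def binned_calibration_error(
    predictions: Sequence[float],
    outcomes: Sequence[int],
    num_bins: int = 10,
) -> float:
    """Hard-binned expected calibration error (illustrative only, not SCDL)."""
    if len(predictions) != len(outcomes):
        raise ValueError("predictions and outcomes must have the same length")
    if len(predictions) == 0:
        raise ValueError("need at least one prediction")

    # Per-bin running totals: sample count, sum of predictions, sum of outcomes.
    counts = [0] * num_bins
    pred_sums = [0.0] * num_bins
    outcome_sums = [0.0] * num_bins

    for p, y in zip(predictions, outcomes):
        if not 0.0 <= p <= 1.0:
            raise ValueError("predictions must lie in [0, 1]")
        # Equal-width bins; a prediction of exactly 1.0 falls in the last bin.
        b = min(int(p * num_bins), num_bins - 1)
        counts[b] += 1
        pred_sums[b] += p
        outcome_sums[b] += y

    n = len(predictions)
    error = 0.0
    for c, ps, ys in zip(counts, pred_sums, outcome_sums):
        if c == 0:
            continue  # empty bins contribute nothing
        # |mean prediction - observed frequency|, weighted by the bin's share of samples.
        error += (c / n) * abs(ps / c - ys / c)
    return error


# Four forecasts of 0.9 with only one positive outcome: observed frequency is 0.25.
print(binned_calibration_error([0.9, 0.9, 0.9, 0.9], [1, 0, 0, 0]))  # approximately 0.65
```

With hard bin edges, this estimate can jump when a prediction moves slightly across an edge, which is one reason continuity is listed as a desirable property for calibration measures.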

Sharpness Minimization Algorithms Do Not Only Minimize Sharpness To Achieve Better Generalization

Neural Information Processing Systems, Apr-24-2026, 06:24:20 GMT

Despite extensive studies, the underlying reason why overparameterized neural networks can generalize remains elusive. Existing theory shows that common stochastic optimizers prefer flatter minimizers of the training loss, and thus a natural potential explanation is that flatness implies generalization. This work critically examines this explanation. Through theoretical and empirical investigation, we identify the following three scenarios for two-layer ReLU networks: (1) flatness provably implies generalization; (2) there exist non-generalizing flattest models and sharpness minimization algorithms fail to generalize; and (3) perhaps most strikingly, there exist non-generalizing flattest models, but sharpness minimization algorithms still generalize. Our results suggest that the relationship between sharpness and generalization subtly depends on the data distributions and the model architectures, and that sharpness minimization algorithms do not only minimize sharpness to achieve better generalization. This calls for the search for other explanations for the generalization of overparameterized neural networks.

artificial intelligence, generalization, machine learning, (14 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (1.00)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)

18561617ca0b4ffa293166b3186e04b0-Paper-Conference.pdf

Neural Information Processing Systems, Feb-7-2026, 16:44:04 GMT

However, foundational theoretical questions about this algorithm's privacy loss remain open, even in the seemingly simple setting of smooth convex losses over a bounded domain. Our main result resolves these questions: for a large range of parameters, we characterize the differential privacy up to a constant.

artificial intelligence, machine learning, privacy loss, (16 more...)

Neural Information Processing Systems

Country:

North America > United States > Wisconsin > Dane County > Madison (0.04)
North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)

Industry: Information Technology > Security & Privacy (0.46)

Technology:

Information Technology > Artificial Intelligence > Machine Learning (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.69)

1091660f3dff84fd648efe31391c5524-AuthorFeedback.pdf

Neural Information Processing Systems, Feb-7-2026, 12:25:53 GMT

multi-head strategy, reviewer, selection, (12 more...)

Neural Information Processing Systems

Genre: Research Report (0.56)

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.32)

0354767c6386386be17cabe4fc59711b-Paper-Conference.pdf

Neural Information Processing Systems, Feb-7-2026, 07:08:15 GMT

arxiv preprint arxiv, generalization, sharpness, (12 more...)

Neural Information Processing Systems

Country:

North America > United States > California > Santa Clara County > Palo Alto (0.04)
North America > United States > Pennsylvania > Allegheny County > Pittsburgh (0.04)
Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)

Genre: Research Report > New Finding (0.69)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.70)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Investigating the Impact of Rationales for LLMs on Natural Language Understanding

Shi, Wenhang, Bian, Shuqing, Chen, Yiren, Zhang, Xinyi, Zhao, Zhe, Hu, Pengfei, Lu, Wei, Du, Xiaoyong

arXiv.org Artificial Intelligence, Oct-21-2025

Chain-of-thought (CoT) rationales, which provide step-by-step reasoning to derive final answers, benefit LLMs in both inference and training. Incorporating rationales, either by generating them before answering during inference or by placing them before or after the original answers during training, significantly improves model performance on mathematical, symbolic and commonsense reasoning tasks. However, most work focuses on the role of rationales in these reasoning tasks, overlooking their potential impact on other important tasks such as natural language understanding (NLU). In this work, we raise the question: Can rationales similarly benefit NLU tasks? To conduct a systematic exploration, we construct NLURC, a comprehensive and high-quality NLU dataset collection with rationales, and develop various rationale-augmented methods. Through exploring the applicability of these methods on NLU tasks using the dataset, we uncover several potentially surprising findings: (1) CoT inference shifts from hindering NLU performance to surpassing direct label prediction as model size grows, indicating a positive correlation. (2) Most rationale-augmented training methods perform worse than label-only training, with one specially designed method consistently achieving improvements. (3) LLMs trained with rationales achieve significant performance gains on unseen NLU tasks, rivaling models ten times their size, while delivering interpretability on par with commercial LLMs.

large language model, machine learning, natural language, (18 more...)

arXiv.org Artificial Intelligence

2510.16686

Country:

North America > United States (1.00)
Europe (1.00)
Asia (1.00)

Genre: Research Report > New Finding (0.46)

Industry: Education (0.94)

Technology:

Information Technology > Artificial Intelligence > Natural Language > Large Language Model (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.67)

1091660f3dff84fd648efe31391c5524-AuthorFeedback.pdf

Neural Information Processing Systems, Oct-9-2025, 13:14:06 GMT

multi-head strategy, reviewer, selection, (11 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.32)

58af908d6293810f1a29e69bf723dc48-Supplemental-Datasets_and_Benchmarks.pdf

Neural Information Processing Systems, Oct-8-2025, 18:00:38 GMT

We also provide the ground truth object masks. Question 2: How many instances are there in total of each type?

artificial intelligence, dataset, machine learning, (17 more...)

Neural Information Processing Systems

Genre: Research Report > New Finding (0.68)

Industry:

Law (0.46)
Information Technology (0.46)

Technology: Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.46)

939314105ce8701e67489642ef4d49e8-AuthorFeedback.pdf

Neural Information Processing Systems, Aug-15-2025, 04:03:39 GMT

We answer your main questions as follows. "Is there any hope to avoid the We will add a remark in the paper to discuss this point more thoroughly. Question 2. "Technically, I think in order for Lemma 4 to hold, f needs to be defined on the whole vector space" The issue has also been identified by Reviewer #3. We will improve the paper writing to make this point more clear. Question 2. "what regret ... if ... only access to 1 gradient query per step, rather than the two used in OEGD." We address your main questions as follows. Question 1. "how would the lower-bound of function appear in your bounds if we assume they are not positive" Question 2. "how would the algorithms / results change if 0 is not in X?" Answer 2. There are three places we use this assumption: About the self-bounding property of smooth functions, you are absolutely correct. For other minor issues, we will carefully revise the paper according to your constructive comments. Below we address your concerns and clarify the misunderstandings. Question 2. "The novelty of the paper is limited.

algorithm, question 2, reviewer, (13 more...)

Neural Information Processing Systems

Technology: Information Technology > Artificial Intelligence > Machine Learning (0.35)
